Dear Sir:



1. I, L. Alison McInnes, declare and say I am a co-inventor of the claims of the above-identified
patent application. I directed others and personally performed the research leading to the invention disclosed and
claimed therein.

2. I have read the Office Action dated April 24, 2001 in this application and understand that the 
Examiner has rejected pending claims 1-12 and 25-27 on the basis that the specification is not enabling for the
full scope of the claims. 



3. The data presented below show that, using techniques described in the specification, at least five
new polymorphisms, including single nucleotide polymorphisms (SNPs), were identified in the narrow interval on
chromosome 18p described in the application, which polymorphisms are associated with bipolar mood disorder
(BP). Thus, in addition to the polymorphisms already identified in the patent application, and using the guidance 
provided in the application, several additional polymorphisms were identified that are associated with BP. 
Association of polymorphisms with BP in a Narrow Interval on Chromosome 18p
as Identified in the Instant Application and Corroborated by Subsequent Work 

4. The instant application provided data showing a positive LOD score for a D18S59 allele with
BP in a pedigree analysis, and gave evidence of an association of D18S59 with BP in a population study. The
instant application further showed a positive LOD score for a D18S476 allele with BP in a pedigree analysis and
gave evidence of association in population studies. In a subsequent study of linkage disequilibrium (LD) on
chromosome 18 in a population sample of 69 BP-I patients from the Central Valley of Costa Rica (CVCR), the
same D18S59 allele was associated with BP-I. Escamilla et al. (1999) Am. J. Hum. Genet. 64:1670-1678; a copy
of which is provided herewith as Exhibit 2.

5. Further genotyping of the 69 affected individuals using four publicly available microsatellite
markers delineated a segment of maximal LD with BP-I, covering about 331 Kb. Evaluation of a larger sample
(227 patients and relatives, and 26 independent control trios) using these markers showed continuing evidence of
LD and haplotype sharing in this sample for this region. Escamilla et al. (2001) Am. J. Med. Genet.
105:207-213; a copy of which is provided herewith as Exhibit 3.

6. Thus, the instant application provides evidence that at least two polymorphisms are
associated with BP. This association was corroborated by work published after the filing date of the instant
application. These markers are in a narrow interval between SAVA5 and ga203 on chromosome 18p. Within
this region, a segment of about 331 kb having maximal LD with BP was further delineated.



At Least Five Additional Polymorphisms Associated with BP were Found in the 
Previously Identified Narrow Interval 

7. Using techniques described in the instant application, at least five additional polymorphisms
were identified that are located within the narrow interval between SAVA5 and ga203, and that are associated
with BP.

8. As described in detail below, four new microsatellite markers and 26 new single nucleotide
polymorphisms (SNPs) were identified in the narrow interval on chromosome 18p. The results of LD analysis of
these 30 new markers, as well as four previously identified microsatellite markers, are displayed in Table 1. Of
the 34 markers presented in Table 1, 16 showed association (λ > 0) with BP in at least one of the two samples.
The p-value for five of these 16 markers was < 0.01. All five of these markers (PH84, PH205, PH202, PH208,
and TS30) had estimates of λ near 1.0, indicating that virtually all affected individuals had at least one copy of the
associated allele.



METHODS 
Sample collection 

9. Two samples were analyzed. The first sample was composed of 227 CVCR
BP-I individuals (including the set of 69 patients from Escamilla (2001) that gave the original association
evidence in 18p) and their available first degree relatives (total N=563). All affected individuals had at least two
psychiatric hospitalizations, with the first hospitalization by age 50. The second sample comprised these 563
individuals and a set of controls (52 unrelated parents of students recruited from the University of Costa Rica who
were selected for CVCR ancestry [at least 5 out of 8 great-grandparents from the CVCR]).

Radiation hybrid and STS-content mapping of markers within the candidate interval 

10. Genetic and physical mapping information was initially obtained from Whitehead Institute 
for Biomedical Research/MIT Center for Genome Research, Stanford Human Genome Center, GENETHON 
Human Genome Research Center, and the Cooperative Human Linkage Center. Radiation hybrid (RH) 
mapping was used extensively in the early phase of this study to resolve discrepancies in marker order 
between maps. Specifically, the 83-clone Stanford G3 radiation hybrid panel was used to map all genetic and STS
markers available from public databases as well as those developed specifically for the project. In addition to
RH mapping, STS-content mapping using BAC (Bacterial Artificial Chromosome) clones from the region of 
interest was also used routinely to determine the marker order and to complete the BAC contig. 



BAC library screening, end sequencing and contig building 

11. Microsatellite and STS markers obtained from public databases were used to screen the
human BAC library from Research Genetics (Huntsville, AL) by PCR, or to screen the BAC library from Genome
Systems (St. Louis, MO) by hybridization, according to the manufacturers' protocols. BAC DNA from
positive clones was prepared, and sequences of the BAC ends were obtained by cycle sequencing the BAC
DNA directly with vector primers T7 and SP6, respectively. PCR primers were designed from non-repetitive
end sequences and used as STS markers to improve the physical map and the BAC contig construction. The
outlying markers from each side of the contigs were used to screen for overlapping BAC clones to extend the
contigs.

Construction of randomly sheared libraries from BACs 

12. BAC DNA was sheared to small fragments of the desired size range using a nebulizer. After
shearing, the libraries were constructed using established techniques.

Microsatellite and SNP marker development and genotyping 

13. Microsatellite markers were generated by hybridizing oligonucleotide probes for di-, tri-, and
tetranucleotide repeats to randomly sheared sub-libraries made from BAC clones, using Quicklite non-isotopic
enzyme-induced chemiluminescent reagents from Lifecodes Corp. (Stamford, CT) following the
manufacturer's instructions. Positive clones were sequenced to identify microsatellite sequences, and primers
were then designed from flanking unique DNA sequence. Primers for amplifying STS markers were also
designed using BAC end sequences and random sequences available within the candidate interval once
extensive sequencing of the randomly sheared libraries was done. Primer sequences are publicly available at
PNAS Online.

14. We genotyped the 4 new microsatellites identified by us in sequencing the region. Primer 
sequences are available on request. Genotyping procedures for the microsatellites were performed using
established techniques. 

15. Single nucleotide polymorphisms (SNPs) were identified using SSCP (Single Strand 
Conformational Polymorphism) analysis of STS markers (all < 300 basepairs in length), using established 
techniques. We used four unrelated individuals to screen for each SNP. We genotyped the SNPs in patient 
and control samples using standard SSCP procedures. 

Sequencing of the candidate interval and identification of the candidate genes 

16. In the interval of < 3 cM, located within the SAVA5-ga203 interval, randomly sheared
libraries prepared from BACs covering this region were sequenced at 10X coverage to discover all sequence
information and identify all genes within the interval. More than 10,000 individual sequences from the region
were compared by BLAST (ref. 20) with sequences from publicly available databases and were analyzed using
GRAIL (ref. 21) to identify potential coding sequences. In addition, sequences were assembled using PHRAP
(refs. 22-24) into a single DNA strand of ~331 Kb. The whole sequence was again analyzed using BLAST and
GRAIL to aid in gene prediction. These data were displayed in ACEdb (data available from
ncbi.nlm.nih.gov) to visualize predicted exons and their relationships to each other.

Statistical analyses 

17. We applied a modified version of Terwilliger's likelihood ratio test of LD to the 4 novel
microsatellites and 26 SNPs that spanned our 331 Kb candidate region. For each of these 30 markers we applied
this test twice: once in the sample of 227 patients and their available relatives, and once with the addition of the
independent controls to the 227 patients and relatives. This likelihood ratio test estimates a single parameter, λ,
which quantifies the overrepresentation of an associated marker allele on disease chromosomes versus control
chromosomes. λ is related to the common epidemiological parameter of population attributable risk. If the
frequency of an associated allele on disease and normal chromosomes is given by $p_D$ and $p_N$, respectively, then λ
is calculated as $\lambda = (p_D - p_N)/(1 - p_N)$. Only positive associations with disease are permitted, and λ ranges from 0
(under the null of no association) to 1.0 (all disease chromosomes carry the associated allele). Others have shown
that λ is the most closely related to the recombination fraction with disease and less influenced by marker allele
frequencies than other measures of LD. Because we do not know which chromosome of an affected individual
harbors the disease locus, we incorporated a genetic model of disease transmission in the procedure of Terwilliger.
Using this model also enabled us to employ data from additional family members other than parents, if the parents were
not available. The same genetic model (mostly dominant with reduced penetrance) was used as in our previous
LD papers and in the genome screen of the Costa Rican pedigrees described in McInnes et al. In this model one
chromosome of the affected individual is used as a control chromosome. The use of a model is likely to increase
the power of the test and the precision of the estimates of λ when the inheritance pattern is approximately known.
Using simulated data, Terwilliger shows that his test is conservative.
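
As a rough illustration of the point estimate described in paragraph 17, the following sketch computes λ directly from two allele frequencies. It is not the likelihood ratio test itself, which estimates λ by maximum likelihood under a genetic model of transmission; the function name `lambda_ld` and the example frequencies are assumptions made only for illustration and are not data from Table 1.

```python
def lambda_ld(p_d: float, p_n: float) -> float:
    """Point estimate of lambda = (p_D - p_N) / (1 - p_N).

    p_d: frequency of the associated allele on disease chromosomes.
    p_n: frequency of the same allele on normal (control) chromosomes.
    Negative values are truncated to 0, because only positive
    associations with disease are permitted.
    """
    if not 0.0 <= p_d <= 1.0:
        raise ValueError("p_d must lie in [0, 1]")
    if not 0.0 <= p_n < 1.0:
        raise ValueError("p_n must lie in [0, 1)")
    return max(0.0, (p_d - p_n) / (1.0 - p_n))


if __name__ == "__main__":
    # Hypothetical frequencies, chosen only to show the range of lambda.
    print(round(lambda_ld(0.80, 0.25), 3))  # allele enriched on disease chromosomes
    print(lambda_ld(0.20, 0.50))            # allele depleted: truncated to 0.0
    print(lambda_ld(1.00, 0.30))            # every disease chromosome carries the allele
```

With these hypothetical inputs, an allele carried by 80% of disease chromosomes and 25% of normal chromosomes gives λ of about 0.73, and λ reaches 1.0 only when every disease chromosome carries the allele.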

Results 

Marker development and physical map 

18. Based on our previous results (as described in the instant patent application and in the
publications provided herewith), we focused marker development and physical mapping efforts (including
direct sequencing) in the <3 cM region between SAVA5 and D18S1231. Within this region we identified 4
new microsatellite markers and 26 SNPs to add to the 4 publicly available microsatellite markers already used
(see Exhibit 3). Based on the extent of haplotype sharing in pedigree CR001 and LD results from the
previously used markers, we focused our detailed investigation on the region of about 331 Kb between PH33
and D18S1231 (although in public databases this segment is estimated as being 378 Kb in length, contig
NT_011005). Using several sequence analysis tools and database mining procedures (see Methods, above),
we determined that this interval contained six known genes (CENTRIN, CLUL1, TYMS, rTS, YES1, and
ADCYAP1, ordered from telomeric to centromeric, with TYMS and rTS overlapping each other). This order
differs in the public database (CENTRIN, CLUL1, YES1, rTS, TYMS, and ADCYAP1, with no overlap
between rTS and TYMS). All of the genes except the "clusterin-like 1 (retinal)" gene [CLUL1] have been well
characterized previously. CLUL1 was originally identified during a screen of a human retinal cDNA library 
for retina-specific genes. The function of this gene is not known; however Northern blot analysis reveals that 
it is highly expressed in retina with much lower yet detectable expression in several other tissues including 
brain, kidney and testes. 

Genotyping results 

19. We genotyped the 30 new markers in pedigree CR001 and in the CVCR patient and control
samples. Results of the LD analysis for these markers (and the four previously available markers reported in
ref. 8) are displayed in Table 1 (provided herewith as Exhibit 4). Of the 34 markers presented in Table 1, 16
showed association (λ > 0) with BP-I in at least one of the two samples (that with 227 patients/relatives and
that with 227 patients/relatives and the addition of 52 controls). The p-value associated with the estimate of
λ was <0.01 for five of these 16 markers, and for four of the five markers the magnitude of association was
greater in the sample containing the population controls. All five of these markers (PH84, PH205, PH202,
PH208, and TS30) had estimates of λ near 1.0, indicating that virtually all affected individuals had at least
one copy of the associated allele. The markers showing LD are clustered in the 19 Kb segment between exon
8 of CLUL1 and exon 1 of TYMS. This segment also contains the minimal region of haplotype sharing
within CR001, and for each marker in this segment, the associated alleles seen in the population samples are
the same alleles in the shared haplotype in CR001 (last column in Table 1).
Summary

20. The data presented herein extend the findings described in the instant patent application. The 
patent application provided evidence, from both pedigree analyses and population studies, that a number of 
polymorphisms, including a 154 bp allele of the microsatellite marker D18S59 and a 271 bp allele of the 
microsatellite marker D18S476, are associated with BP. The patent application described how to identify
additional markers, and how to determine whether such markers are associated with BP. 

21. The data presented herein show that, using techniques described in the patent application, several
new polymorphisms, located in the previously identified interval and associated with BP, were identified. 
Conclusion

22. Those in the field, given the guidance in the instant patent application, could identify 
additional polymorphisms associated with BP. 

23. I hereby declare that all statements made herein of my own knowledge are true and
that all statements made on information and belief are believed to be true; and further that these
statements were made with the knowledge that willful false statements and the like so made are
punishable by fine or imprisonment, or both, under Section 1001 of Title XVIII of the United States
Code, and that such willful false statements may jeopardize the validity of the application or any patent
issuing thereon. 



Date 





Enclosures: Exhibits 2-4 
Microsatellite Analysis

S. J. Payne



4.1 Introduction 

In the space of 5 years since the first report of their informativeness, thousands of microsatel- 
lites have been characterized and a high-resolution genetic linkage map of the human genome 
built. Microsatellites have been key tools in tracking disease genes both in clinical and 
research laboratories. Short tandem repeat loci (STRs) are used in forensics, identity testing, 
and in analysis of population structure (this will increase as the Human Diversity project 
expands). STR markers are abundant, highly polymorphic and technically very simple to analyze.
It is not too big a cliché to say that microsatellites have revolutionized genetic analysis.
The first DNA polymorphisms to be exploited in genetic linkage studies were single
base-pair sequence variations which occurred within restriction endonuclease recognition
sites. These polymorphisms could be easily detected because variant sequences either created
or abolished enzyme recognition sites and therefore resulted in restriction fragments of variable
length. Botstein et al. [1] proposed constructing genetic maps using restriction fragment
length polymorphisms (RFLPs). Although RFLPs are widely distributed throughout the
genome, their utility is limited by low informativeness. Since most RFLPs are only dimorphic 
(either the enzyme cuts or it doesn't), the maximum heterozygosity of 50% can only occur 
when both alleles are equally represented in a population— most RFLPs have lower heterozy- 
gosities. 
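
The 50% ceiling follows from a short calculation, sketched here under the assumption (not stated in the text) of Hardy-Weinberg proportions, with p denoting the population frequency of one of the two alleles and H the expected heterozygosity:

```latex
H(p) = 2p(1-p), \qquad
\frac{dH}{dp} = 2 - 4p = 0 \;\Rightarrow\; p = \tfrac{1}{2}, \qquad
H_{\max} = 2 \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = \tfrac{1}{2}.
```

Any unequal split lowers the value; for example, p = 0.9 gives H = 0.18.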

Concurrently with the development of RFLPs, another class of DNA polymorphism was
characterized based on tandem arrays of repeated sequences [2-4]. Tandem sequence repetition
is widespread in eukaryotic genomes and many types of repeat motif have been described.
One common feature of repetitive sequence loci is that the number of repeat units differs
between individuals, giving rise to arrays of variable length. Polymorphic markers based on
variable numbers of tandem repeats (VNTRs) are potentially very informative because of the
large number of alleles which may exist. The most polymorphic VNTRs ("minisatellites")
have repeat units of between 12 and 60 or more base-pairs and a total array size of 0.5 to over
3 kb. The major limitation of minisatellite VNTRs is that they tend to be clustered at telomeres
and are therefore of restricted value in constructing complete human genome maps [4].

In the early 1980s a sub-class of repetitive loci was described with a repeat unit of only
two base pairs, so-called "microsatellites" [5-8]. It was not, however, until 1989 that the polymorphic
nature of microsatellites was recognized [9,10]. As with larger VNTRs, microsatellites
vary between individuals in the number of repeats in the array. Their nomenclature is informal
and such loci are variously referred to as STRs, variable small sequence markers
(VSSMs), simple sequence repeats (SSRs), dinucleotide repeats, CA blocks etc. The repeat
unit may be from 1 to 6 bp and the most common microsatellite repeat motifs are A, AC,
AAAN, AAN, AG, and AT [11], although the best characterized are dinucleotide (dC-dA/dG-dT)
repeats. Microsatellites are extremely abundant, occurring with an estimated average frequency
of one STR every 6 kb of human genomic sequence [11]. Microsatellites have clear
advantages over the other polymorphisms described above. STRs often have multiple alleles
and many have heterozygosity frequencies of 70% or more, making them highly informative
for genetic analysis. In addition, the loci are small enough to be analyzed using the polymerase
chain reaction (PCR) [12,13]. The significance of these factors was quickly recognized
and microsatellites soon became markers of choice for many applications.

4.1.1 Informativeness of Microsatellites 

The informativeness of a polymorphic marker depends upon the number of alleles and their 
relative population frequencies. In the context of genetic linkage studies (for example, predictive 
linkage analysis in a family with a genetic disease), the informativeness of a linked marker relates 
to the likelihood that the parental genotypes can be deduced following analysis of a child of an 
affected parent. Botstein et al. [1] described the polymorphism information content (PIC), which is
a statistical assessment of the informativeness of a marker. To evaluate a marker for PIC,
the frequencies of all possible genotypes for a given marker in a population and the frequencies
of all mating-type combinations are first estimated. Next, the probability of informativeness
in offspring of each mating-type combination is calculated. Finally, a value for PIC is obtained
by summing the mating-type frequencies multiplied by the probability of informative offspring.

Marker informativeness is more easily estimated by simply counting the number of het- 
erozygotes in a suitably large sample set. PIC approximates to the observed frequency of het- 
erozygosity. The greater the number of alleles at a given locus (and the more even the spread of 
allele frequencies in a population), the more informative will be the marker. This underlies the 
virtue of microsatellites in linkage analysis and gives measure to the extent to which microsatel- 
lites are much more informative than dimorphic systems such as RFLPs. 
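
The relationship between allele number, heterozygosity, and PIC can be checked numerically. The sketch below assumes Hardy-Weinberg proportions and uses the closed-form expression for PIC commonly attributed to Botstein et al., PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², which the text above describes only in words. The function names and the example allele frequencies are illustrative and are not taken from the chapter.

```python
from itertools import combinations
from math import isclose


def _validate(freqs):
    """Allele frequencies must be non-negative and sum to 1."""
    if any(p < 0.0 for p in freqs) or not isclose(sum(freqs), 1.0, abs_tol=1e-9):
        raise ValueError("allele frequencies must be >= 0 and sum to 1")


def heterozygosity(freqs):
    """Expected heterozygosity: H = 1 - sum(p_i^2)."""
    _validate(freqs)
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    _validate(freqs)
    homozygous = sum(p * p for p in freqs)
    uninformative = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - uninformative


if __name__ == "__main__":
    # Hypothetical markers: a dimorphic RFLP and a four-allele microsatellite.
    markers = {"RFLP": [0.5, 0.5], "STR": [0.25, 0.25, 0.25, 0.25]}
    for name, freqs in markers.items():
        print(name, heterozygosity(freqs), pic(freqs))
```

For two equally frequent alleles this gives H = 0.5 and PIC = 0.375; for four equally frequent alleles it gives H = 0.75 and PIC of about 0.703, which is consistent with the statement that multi-allelic microsatellites are more informative than dimorphic systems.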

4.2 Applications 

4.2.1 Construction of Genetic Maps

Genetic maps are constructed by linkage analysis. Linkage relationships (map order and 
distance between markers) are established by typing a collection of families with the markers 
of interest. Mapping information is obtained by detecting recombination between markers. 
Linkage analysis has been successful in mapping genes for a great number of inherited conditions
as the first step in a positional cloning strategy. The disease itself is treated as a polymorphic
marker with alleles "mutant" and "normal." This clearly relies upon accurate clinical
assessment of recombinant individuals in affected families. 

4.2.2 Disease Gene Tracking 

Clinical molecular genetics laboratories make use of linked markers to perform predictive
or presymptomatic testing for at-risk individuals in affected families. Disease genes are
tracked through families by analyzing inheritance of markers known to be closely linked to
the disease. Figure 4.1 shows an example of use of a STR to track an autosomal dominant, late
Summary

Linkage disequilibrium (LD) analysis has been promoted 
as a method of mapping disease genes, particularly in 
isolated populations, but has not yet been used for ge- 
nome-screening studies of complex disorders. We present 
results of a study to investigate the feasibility of LD 
methods for genome screening using a sample of indi- 
viduals affected with severe bipolar mood disorder (BP- 
I), from an isolated population of the Costa Rican cen- 
tral valley. Forty-eight patients with BP-I were genotyped 
for markers spaced at ~6-cM intervals across chromo- 
some 18. Chromosome 18 was chosen because a pre- 
vious genome-screening linkage study of two Costa Ri- 
can families had suggested a BP-I locus on this 
chromosome. Results of the current study suggest that 
LD methods will be useful for mapping BP-I in a larger
sample. The results also support previously reported 
possible localizations (obtained from a separate collec- 
tion of patients) of BP-I-susceptibility genes at two dis- 
tinct sites on this chromosome. Current limitations of 
LD screening for identifying loci for complex traits are 
discussed, and recommendations are made for future 
research with these methods. 
Introduction

Identifying genes for disorders with complex inheritance 
patterns is one of the greatest challenges in biomedical 
research (Lander and Schork 1994). Such disorders, 
which include many of the most prevalent human dis- 
eases, are difficult to map with standard linkage meth- 
ods. It has been suggested that the availability of dense 
marker maps covering the genome will make linkage
disequilibrium (LD) analysis a feasible approach for 
screening the genome to map complex disorders (Risch 
and Merikangas 1996). Current marker maps are not 
sufficiently dense to enable such studies to be performed 
in heterogeneous populations or in populations that 
were founded in the distant past. However, the success 
of genome-screening LD-mapping studies of genetically 
simple and/or rare diseases in recently founded isolated
populations (Houwen et al. 1994; Puffenberger et al.
1994; Friedman et al. 1995; Newport et al. 1996) provides
the impetus for testing the utility of LD methods
for mapping complex diseases in such populations (Es- 
camilla et al. 1996). In populations where randomly 
sampled patients are on average <20 generations re- 
moved from their last common ancestor, LD may be 
maintained for sizable regions around disease genes. 
Such LD should be manifested by affected individuals 
sharing alleles, identical by descent (IBD), at markers 
spaced at intervals of several centimorgans surrounding 

* These authors contributed equally to this work. 

Present affiliation: Department of Psychiatry, University of Texas 
Health Science Center at San Antonio, San Antonio, Texas.

Present affiliation: Department of Psychiatry, Nara Medical
School, Nara, Japan.

Dr. Gallegos died after this paper was accepted for publication.
This paper is dedicated to his memory. 
a disease gene. We now present the results from the first
stage of a study in which LD methods were used to 
screen for loci that predispose to severe bipolar mood 
disorder (BP-I), which is common and is almost certainly 
characterized by a complex mode of inheritance. The 
study was done in a relatively recently founded isolated 
population, that of the central valley of Costa Rica 
(CVCR) (Escamilla et al. 1996), where founder effects 
have already been observed for several inherited diseases 
(Saborio 1992; Uhrhammer et al. 1995; Shah et al.
1997). 

Despite long-standing evidence that BP-I has a genetic 
basis (Escamilla et al. 1997), genome scans for linkage
have provided equivocal results (Risch and Botstein
1996; Nurnberger et al. 1997) that fail to satisfy the
levels of significance suggested for genomewide screens 
by Lander and Kruglyak (1995). The failure to identify 
BP-I loci definitively, by standard linkage approaches, 
probably reflects uncertainty regarding mode of inheritance,
high phenocopy rates, difficulty in demarcation
of distinct phenotypes, and presumed genetic heteroge- 
neity. LD-based mapping approaches within population 
isolates may offer a means of diminishing several of these 
obstacles. An approach (such as LD mapping) that sam- 
ples individuals from an entire population can more eas- 
ily ascertain a large set of patients with a narrowly de- 
fined, reliably diagnosed phenotype (in this case, BP-I) 
than linkage-based approaches that require ascertain- 
ment of family units with multiple affected cases. Within 
a population isolate, genetic heterogeneity of BP-I may 
also be less than in larger, genetically mixed populations, 
as there is a high probability that individuals with such 
a phenotype share descent from a few common 
ancestors. 

We collected a sample of patients with BP-I for LD
analysis by identifying individuals currently living in the
CVCR who had known CVCR ancestry. This sample 
was collected independently of our previous pedigree- 
based studies of BP-I in Costa Rica. Our aim in the 
current study was to evaluate the feasibility of identi- 
fying BP-I loci by LD screening in this population, as 
proposed in Escamilla et al. (1996). To do this, we con- 
ducted an LD screen of an entire single chromosome 
(chromosome 18). This chromosome was chosen be- 
cause previous linkage studies in Costa Rica and in other 
populations suggested that it possibly contained bipolar 
disorder loci (Berrettini et al. 1994; Stine et al. 1995;
Freimer et al. 1996a). Genealogical studies indicated that
the individuals in our current study did not share com- 
mon ancestry over the past several generations (Escam- 
illa et al. 1996). We therefore anticipated that we would 
not detect random genome regions shared IBD by more 
than a few individuals and that regions of high IBD 
sharing would thus be areas containing possible BP- 
I-susceptibility genes inherited from a common founder. 
Samples and Methods

Sample Collection 

To diminish the likelihood of investigating phenocop- 
ies, we limited the sample to individuals with a definite 
diagnosis of BP-I, with onset by age 50 years and a 
history of at least two psychiatric hospitalizations. The 
48 patients with BP-I (25 female patients and 23 male 
patients) in the current study were recruited indepen- 
dently from psychiatric hospitals and clinics in the 
CVCR. First-degree relatives of patients were also re- 
cruited, to determine genetic phase. The study was ap- 
proved by institutional review boards at the Costa Rican 
Ministry of Health, the University of Costa Rica, and 
the University of California at San Francisco, and in- 
formed consent was obtained from all participating sub- 
jects. Of the 48 BP-I subjects, 8 individuals had both 
parents available for genotyping, 20 individuals had one 
parent available, 10 individuals had one or more chil- 
dren available, 1 individual had two siblings available, 
and 9 individuals had no relatives available. In nuclear 
families, only one individual (the proband) was desig- 
nated as affected, and all others were considered to have 
unknown phenotype. Details of ascertainment and diagnostic
procedures, and the clinical and genealogic profiles
of the study sample, can be found in Escamilla et
al. (1996).

Genotyping 

We used 26 markers, spanning chromosome 18, to 
genotype all 48 affected individuals (as well as 53 relatives,
to establish phase). Of the 25 regions, 21 were
<6 cM, and 4 were 6-7 cM. The average distance between
markers was 4.8 cM. When choices were available,
we chose the most polymorphic marker (Gyapay
et al. 1994). The average heterozygosity of the markers 
used in this screen (in the CEPH pedigree collection) was 
0.75. (The only screening markers with heterozygosity 
values <0.70 were D18S464, D18S60, D18S378, and 
D18S469.) We screened chromosome 18 at a marker 
density of 6 cM because available marker maps had gaps 
cM, and our goal was to have an equal density of 
coverage across the chromosome (Gyapay et al. 1994). 
We chose markers from the maps available, at the time 
of the study, from Genethon (Gyapay et al. 1994), the 
Cooperative Human Linkage Center (Murray et al.
1994), and the public database of the Utah Center for 
Genome Research. Genotyping procedures used for all 
experiments were as previously described by Di Rienzo 
et al. (1994). In brief, one of the two primers was labeled
radioactively with polynucleotide kinase, and PCR
products were separated by electrophoresis on polyacrylamide
gels. Autoradiographs were scored independently
by two raters. Data for each marker were entered
into the computer database twice, and the resultant files
were compared for discrepancies. Scoring was done 
without knowledge of affected status. 

Simulations 

We conducted simulations to evaluate the power of a 
likelihood-based test of LD (Terwilliger 1995), to detect 
a result significant at the .05 level, with these assump- 
tions: a 6-cM marker map; a disease gene in the middle 
of the 6-cM segment; affected subjects, with one copy 
of the disease gene, separated by 10 generations from a 
common ancestor; and four equally frequent marker alleles
at each marker site. (The disease gene was associated
with the "1" allele at the marker locus.) Under these
assumptions, and with a phenocopy rate of 0%, normal
chromosomes carried each marker allele with a probability
of 25% (normal-chromosome distribution), and
disease chromosomes carried the "1" allele with a probability
of 80%. The probability of disease-chromosome
distribution was calculated with the formula
$(1 - \theta)^G + [1 - (1 - \theta)^G] \times f$, where θ = recombination
fraction, G = number of generations from a common
ancestor, and f = the frequency of the allele in the population.
Thus, the disease chromosomes carry the "1"
allele with a probability of 80% and each of the remaining
three alleles with a probability of 6.7%. Because
the true genetic structure of bipolar disorder is unknown, 
we examined several different conditions of etiologic het- 
erogeneity (which would include locus and allelic het- 
erogeneity, as well as phenocopies). We investigated phe- 
nocopy rates of 0%, 33%, and 67% (with phenocopy 
rates of 33% and 67%, the percentages of chromosomes 
from affecteds with the "1" allele are 62% and 43%,
respectively). If an affected individual was randomly se- 
lected as a phenocopy (with a probability equal to the 
phenocopy rate), then the marker allele on all four pa- 
rental chromosomes was randomly chosen from the nor- 
mal chromosome distribution. If the affected individual 
was randomly chosen as a true case (with a probability
of 1 minus the phenocopy rate), the marker allele for
one chromosome of that individual was randomly cho- 
sen from the normal chromosome distribution, and the 
other chromosome's marker alleles were randomly cho- 
sen from the disease-chromosome distribution. Recom- 
bination occurred on parental chromosomes in propor- 
tion to the marker map. Marker alleles for 
nontransmitted chromosomes of the parents were ran- 
domly chosen from the normal chromosome distribu- 
tion. We performed these analyses by using the 48 pa- 
tients with BP-I plus their available relatives. One 
hundred replications were performed for each simula- 
tion. Available relatives were considered to have un- 
known disease phenotype. For the 10 affected individ- 
uals with at least one child available for genotyping, one 
chromosome from the affected parent was randomly
simulated to be transmitted to available children, and
the other chromosome was randomly selected from the 
normal chromosome distribution. Although data were 
simulated for parents of all affected individuals, if par- 
ents were not available for genotyping, their simulated 
genotypes were not used in these analyses. 
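
The allele probabilities quoted in this subsection can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the disease gene sits 3 cM from the nearest marker (the middle of a 6-cM interval) and that 1 cM corresponds to a recombination fraction of about 0.01.

```python
def carrier_allele_prob(theta, generations, allele_freq):
    """Probability that a disease chromosome still carries the ancestral marker allele.

    (1 - theta)^G is the chance of no recombination in G generations; otherwise
    the allele is drawn at its population frequency f.
    """
    no_recombination = (1.0 - theta) ** generations
    return no_recombination + (1.0 - no_recombination) * allele_freq


def diluted_prob(p_disease, p_normal, phenocopy_rate):
    """Expected "1"-allele frequency on the putative disease chromosome of affecteds
    when a fraction of them are phenocopies drawn from the normal distribution."""
    return (1.0 - phenocopy_rate) * p_disease + phenocopy_rate * p_normal


# Assumed inputs: theta = 0.03 (3 cM), G = 10 generations, f = 0.25.
p_6cm = carrier_allele_prob(theta=0.03, generations=10, allele_freq=0.25)
print(round(p_6cm, 2))                               # about 0.80
print(round((1.0 - 0.80) / 3, 3))                    # each remaining allele, about 0.067
print(round(diluted_prob(0.80, 0.25, 0.33), 2))      # phenocopy rate 33%
print(round(diluted_prob(0.80, 0.25, 0.67), 2))      # phenocopy rate 67%
```

Under these assumptions the values agree with the 80%, 6.7%, 62%, and 43% figures given in the text.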

We also did power simulations (100 replications for 
each model) of larger sample sizes, using an ideal situ- 
ation in which both parents are available for genotyping, 
to aid in planning future studies. In these simulations 
we used sample sizes of 90, 200, 300, and 400 affected 
individuals; phenocopy rates of 50% and 75%; and a 
marker map of 2.5 cM, with all other assumptions as 
described above. With this denser marker map, at a
phenocopy rate of 0%, disease chromosomes carried the
"1" allele with a probability of 90%, calculated by the
formula $(1 - \theta)^{G} + [1 - (1 - \theta)^{G}] \times f$,
and each of the remaining three alleles with a
probability of 3.3%. Details of the likelihood-ratio test
used in analyzing simulation results are described in
Analysis.
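
The per-individual sampling step of these simulations might be sketched as follows. This is a simplified illustration, not the authors' simulation program: it draws only the two marker alleles of each affected individual, ignores parents, relatives, and recombination along the map, and uses hypothetical names throughout. The disease-chromosome weights are the 90% and 3.3% values quoted for the 2.5-cM map.

```python
import random

ALLELES = [1, 2, 3, 4]
NORMAL = [0.25, 0.25, 0.25, 0.25]              # four equally frequent alleles
DISEASE = [0.90, 0.1 / 3, 0.1 / 3, 0.1 / 3]    # values quoted for the 2.5-cM map


def draw_proband(rng, phenocopy_rate):
    """Return (allele on the putative disease chromosome, allele on the other one)."""
    if rng.random() < phenocopy_rate:
        # Phenocopy: both chromosomes come from the normal distribution.
        first = rng.choices(ALLELES, weights=NORMAL)[0]
    else:
        # True case: one chromosome comes from the disease distribution.
        first = rng.choices(ALLELES, weights=DISEASE)[0]
    second = rng.choices(ALLELES, weights=NORMAL)[0]
    return first, second


def fraction_with_allele_1(n, phenocopy_rate, seed=0):
    """Fraction of putative disease chromosomes carrying the "1" allele."""
    rng = random.Random(seed)
    hits = sum(draw_proband(rng, phenocopy_rate)[0] == 1 for _ in range(n))
    return hits / n
```

With a large number of draws, the fraction should approach (1 - phenocopy rate) x 0.90 + phenocopy rate x 0.25, for example roughly 0.41 at a phenocopy rate of 75%.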

Analysis 

We used two different procedures to identify regions
potentially shared IBD by patients with BP-I. The first
approach, a search for shared segments, has the advan- 
tage of being nonparametric. The second approach, al- 
though requiring parameters of the illness to be specified, 
has the advantage of providing a formal test statistic, 
allowing for the calculation of P values. These two tests 
thus offer compensatory strengths and weaknesses when 
used in the search for genes in a complex disease. 

We first searched for shared segments (Houwen et al.
1994). For each individual, we evaluated two-marker
haplotypes in each of the 25 intermarker intervals, by
using a preselected threshold (the possible sharing of a
haplotype by ≥50% of patients) to select segments for
further investigation. Since this screen does not differ- 
entiate between sharing that is IBD and sharing that is 
identical by state (IBS), use of lower thresholds would 
lead to too many segments passing the screen. 
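
One way to picture the "possible sharing" screen is the sketch below. It is an assumption-laden simplification rather than the procedure of Houwen et al.: genotypes are treated as unphased allele pairs, a patient is counted as a possible carrier of a two-marker haplotype whenever both alleles appear in that patient's genotypes, and all function names are hypothetical.

```python
from itertools import product


def best_possible_haplotype(genotypes, m1, m2):
    """Find the two-marker haplotype compatible with the most patients.

    genotypes: list of dicts mapping marker name -> (allele, allele), unphased.
    Returns (haplotype, fraction of patients who could carry it).
    """
    counts = {}
    for person in genotypes:
        # Every pairing of one allele at m1 with one allele at m2 is "possible".
        for hap in set(product(set(person[m1]), set(person[m2]))):
            counts[hap] = counts.get(hap, 0) + 1
    best = max(counts, key=counts.get)
    return best, counts[best] / len(genotypes)


def shared_segment_screen(genotypes, markers, threshold=0.5):
    """Flag adjacent-marker intervals whose best haplotype meets the threshold."""
    flagged = []
    for m1, m2 in zip(markers, markers[1:]):
        hap, fraction = best_possible_haplotype(genotypes, m1, m2)
        if fraction >= threshold:
            flagged.append((m1, m2, hap, fraction))
    return flagged
```

Because phase is ignored, this count cannot distinguish IBD from IBS sharing, which is why lowering the threshold lets many more intervals through.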

We also applied a likelihood-ratio test for LD to each 
of the 26 initially tested markers. This test was done 
independently of the results of the shared-segment eval- 
uation. We applied a modified version of the procedure 
of Terwilliger (1995), which only includes case and con- 
trol chromosomes or chromosomes transmitted and not 
transmitted to patients. In our sample there were several 
affected individuals whose parents were not available 
but whose children were available. DNA from these lat- 
ter individuals could not be analyzed with the original 
Terwilliger program but could be analyzed with our
implementation of the same procedure, as described by
Freimer et al. (1996a). This procedure examines the
likelihood that a particular allele (or alleles) is (are)
overrepresented on disease chromosomes compared with
nondisease chromosomes (Terwilliger 1995; Freimer et
al. 1996a). A single parameter, λ, is estimated, which
quantifies such overrepresentation of marker alleles on
disease chromosomes. Designation of chromosomes of
probands as disease carrying or non-disease carrying
was achieved by specification of a genetic model for the
disease. The same model of transmission was used in
this LD-likelihood test as was used in the initial genome
screen of the Costa Rican families, described in McInnes
et al. (1996). In brief, this model assumes that the disease
is nearly dominant (assuming penetrance of .81 for
heterozygous individuals and .90 for homozygous
individuals with the disease mutation), that the population
prevalence of BP-I in Costa Rica is .015, and that the
frequency of the disease gene in the population is .003.
In the likelihood calculations, all possible
disease-genotype combinations of all relatives are considered.
With the model that was used, in which the disease-gene
frequency is very low, the LD-likelihood test, in most cases,
treats the probands as effectively heterozygous at the
disease locus, and chromosomes of other relatives not
occurring in the probands are treated as non-disease-
carrying chromosomes. We did not specify a phenocopy
rate in the genetic model, because the effect of
phenocopies will be absorbed by the parameter λ; the presence
of phenocopies in our sample will serve to erode the
association between marker alleles and disease and
hence will reduce the estimate of λ. Because, in the
present LD study, we were attempting to gather further
evidence regarding the findings published in our initial
genome screen, we limited ourselves to this one model
in performing the likelihood analyses. However, both
the BP-I family sample and the current LD sample will
ultimately be analyzed with use of other models. We
considered as promising those markers that gave
evidence of overrepresentation of an allele on affected
chromosomes, with a -2ln(likelihood ratio [LR]) statistic
>1.0.

Table 1

Heterozygosity of Markers Used in the Genome Screen of
Chromosome 18

                 Heterozygosity in    Heterozygosity in
Marker Name      Genethon Database    Costa Rican Sample^a

D18S1140              .49                  .39
D18S59^b              .81                  .81
D18S476^b             .76                  .62
D18S481               .76                  .74
D18S391               .75                  .69
D18S452               .83                  .85
D18S843               NA                   .73
D18S464               .65                  .51
D18S1153              .78                  .69
D18S378               NA                   .54
D18S53                .79                  .81
D18S453               .82                  .81
D18S40                NA                   .81
D18S66                .85                  .81
D18S56                .73                  .74
D18S57                .87                  .85
D18S467^b             .73                  .64
D18S460               .62                  .67
D18S450               .79                  .74
D18S474               .82                  .73
D18S69                .79                  .78
D18S64                .74                  .65
D18S1134              .73                  .68
D18S1147              .85                  .86
D18S60                .37                  .58
D18S55                .77                  .80
D18S68                .79                  .79
D18S477               .62                  .70
D18S61^b              .87                  .86
D18S488               .87                  .82
D18S485^b             .79                  .79
D18S541               NA                   .63
D18S870^b             NA                   .66
D18S469^b             .65                  .64
D18S874               NA                   .64
D18S380               NA                   .63
D18S1121^b            .74                  .77
D18S1009              .74                  .66
D18S844               NA                   .76
D18S554               .82                  .79
D18S461               .77                  .65
D18S70                .83                  .86

Note: NA = data not available.
^a Allele frequencies were calculated from the entire sample,
accounting for known relationships among individuals.
^b Markers with -2ln(LR) >1.0.

Follow-up genotyping and LD-analysis studies were 
performed on markers that gave suggestive findings in 
the shared-segment evaluation. Within each segment 
that passed the threshold described above, 1-3 addi- 
tional markers were typed to permit us to test for LD 
across regions of 1-2 cM. Markers that provided sug- 
gestive evidence of LD by the initial likelihood-ratio test, 
but had not been suggested as promising by the shared-
segment screen, were also followed up, in this case by
typing two additional nearby markers. In all, a total of 
42 markers from chromosome 18 were used to genotype 
the study sample (table 1 and fig. 1). LD analysis of the 
additionally typed markers was conducted by use of the 
likelihood-ratio test. 

Results 

Simulations 

Simulation results for the sample of 48 patients with
BP-I and available relatives showed relatively high power
to detect suggestions of association (P ≤ .05) with low
phenocopy rates (94% for a phenocopy rate of 0%, and
54% for a phenocopy rate of 33%) but a dramatically
decreased power under high phenocopy rates (e.g., 9%
for a phenocopy rate of 67%). Additional simulations
showed that, under higher phenocopy rates, the power
to detect LD can be improved by increasing the sample
size and/or the marker density of screening (table 2). For
instance, with a phenocopy rate of 75%, the power
increases to 82% with a sample of 300 affected individuals
and a 2.5-cM marker map.

Figure 1  Results from the LD screen of chromosome 18. The
26 markers used in the first stage of the screen are listed in the right
column. Sixteen markers used to follow up interesting regions are listed
in the left column. Approximate chromosomal locations of the 26
initial markers and the 16 follow-up markers are indicated by long
and short tick marks, respectively. The eight segments that passed the
initial screen threshold for segment sharing (50% of individuals or
25% of chromosomes sharing a two-marker haplotype) and the five
markers that passed the initial threshold for the Terwilliger LR test
(-2ln[LR] > 1.0) are indicated by blackened bars and asterisks,
respectively. Two-marker segments that passed the initial threshold were
followed up by at least one marker within the segment, if possible (at
the time of the study no markers were available between D18S843
and D18S464, and only one marker was available between D18S464
and D18S378). Markers that passed the initial threshold for the
Terwilliger LR test were followed up with two additional markers. These
additional markers flanked the original finding. The value of the
-2ln(LR) statistic, from the Terwilliger test, is plotted as a solid bar.
This statistic is distributed as a one-sided χ² random variable with one
degree of freedom. The estimate of the λ value, for the eight markers
with positive results, is indicated in parentheses after the -2ln(LR)
statistic. Markers without a -2ln(LR) statistic plotted had estimates
of λ = 0, with the exception of three markers that had estimates of
0 < λ < 0.62.
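
The power figures in this subsection are proportions of 100 simulated replicates, so each carries some Monte Carlo uncertainty. A rough sense of its size can be had from the binomial standard error; the helper below is a hypothetical illustration and is not part of the reported analysis.

```python
import math


def power_standard_error(power, replicates):
    """Binomial standard error of a power estimate based on independent replicates."""
    return math.sqrt(power * (1.0 - power) / replicates)


# Example: an estimated power of 82% from 100 replicates.
print(round(power_standard_error(0.82, 100), 3))
```

For an estimate of 82% from 100 replicates this gives a standard error of about 4 percentage points, so small differences between power values should be read with caution.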

Shared-Segment Screen

We evaluated 25 possible shared segments (defined by
the 26 markers genotyped in the sample). Eight regions
passed the threshold of possible IBD sharing by ≥50%
of patients. These regions were bounded by the following
markers: D18S843-D18S464, D18S464-D18S378,
D18S467-D18S474, D18S64-D18S60, D18S60-
D18S68, D18S485-D18S469, D18S469-D18S1009, and
D18S1009-D18S461 (fig. 1).

Linkage-Disequilibrium Testing 

Five (D18S59, D18S467, D18S61, D18S485, and
D18S469) of the original 26 markers displayed evidence
of possible LD, by means of a likelihood procedure
(-2ln[LR] statistic >1.0; table 3). Two (D18S59 and
D18S61) of these five markers had not been identified
as markers of interest by the shared-segment evaluation.
D18S59, located near 18pter, displayed the strongest
pointwise evidence for LD (-2ln[LR] statistic of 8.3,
P = .002) of all the markers tested in this sample.
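
The legend of figure 1 describes the -2ln(LR) statistic as a one-sided chi-square variable with one degree of freedom. Assuming that this means the usual halved chi-square tail probability, the pointwise P values can be sketched with the standard library alone; the function name is hypothetical.

```python
import math


def one_sided_p(stat):
    """Pointwise P value for -2ln(LR), treated as a one-sided chi-square (1 df).

    The 1-df chi-square tail is erfc(sqrt(x / 2)); the one-sided test halves it.
    """
    if stat <= 0:
        return 1.0
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))


for stat in (8.3, 5.03, 2.5, 1.0):
    print(stat, round(one_sided_p(stat), 3))
```

Under this assumption, statistics of 8.3, 5.03, and 2.5 correspond to P values of about .002, .01, and .06, and the screening threshold of 1.0 corresponds to a P value of about .16.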

Follow-up of Initial Results 

Using the protocol discussed in Samples and Methods, 
we genotyped additional markers within the segments 
that passed the shared-segment screen as well as
follow-up markers surrounding one (D18S59) of the two
markers that had passed only the LD screen. We were
unable to follow up one shared-segment region (D18S843-
D18S464), because additional polymorphic markers
were not available within the segment. We were also
unable to follow up the finding for D18S61 for the same
reason. Three (D18S476, D18S870, and D18S1121) of
the 16 follow-up markers typed displayed additional
evidence of possible LD (fig. 1).

These additional results brought to eight the total
number of markers with -2ln(LR) statistics >1.0 (table
3). Five of these eight marker loci were clustered within
a small region of 18q22-23, where the most significant LD
was observed at D18S1121 (-2ln[LR] of 5.03 and
P = .01), and two were in 18pter.
For the two 18pter markers (D18S59 and D18S476),
the alleles overrepresented on BP-I chromosomes (154
and 271 bp, respectively) form a haplotype that occurs
in 48% of the patients with BP-I. Overall, this haplotype
occurs on 26% of the chromosomes of individuals with
BP-I and on 4% of the chromosomes not transmitted
from parents to individuals with BP-I (definite phase for
these two markers could be assigned in 25 patients with
BP-I [50 chromosomes] and 25 nontransmitted parental
chromosomes). Because the composite genetic and
physical maps of the 18q22-23 region had not yet been
completed at the time of this study, the relative order of the
five markers in 18q22-23, for which evidence of LD was
observed, was still too uncertain to permit construction
of definitive marker haplotypes in our study sample.

Table 2

Power of Likelihood-Analysis Test of LD

                  Power to Detect LD for Sample Size (n) =
Phenocopy Rate      90       200      300      400

50%                 82%      99%      100%     100%
75%                 33%      62%      82%      90%

Note: Assumptions included that subjects were removed from a
common ancestor by 10 generations, that a marker map of 2.5 cM
was used, and that each marker had four equally frequent alleles.
Values are the percentage of replicates to have P values <.05.

Escamilla et al.: LD Mapping of BP-I in Costa Rica

Table 3

Frequencies of Marker Alleles Overrepresented in Disease
Chromosomes, as Compared with Nondisease Chromosomes, for
Markers Where -2ln(LR) >1.0

                               Frequency on
                         Nondisease       Disease
Marker       Allele (bp) Chromosomes      Chromosomes

D18S59^a       154          .121             .572
D18S476        271          .470             .771
D18S467^a      172          .384             .693
D18S61^a       177          .074             .326
D18S485^a      182          .237             .586
D18S870        179          .405             .657
D18S469^a      234          .128             .450
D18S1121       168          .171             .553

^a Markers from the screening stage.

Marker D18S467, in the 18q12.3 region, was the one
marker outside 18q22-23 and 18pter to show a
-2ln(LR) >1 (-2ln[LR] = 2.5, P = .06). The additional
markers used to follow up this result (D18S450,
D18S460, and D18S57) displayed no evidence of
association.

Marker Heterozygosity in the Costa Rican Sample 

We calculated heterozygosity values for the markers 
used, on the basis of the allele frequencies, estimated 
from the entire sample, accounting for known
relationships among individuals. These heterozygosities are
shown in table 1, along with the corresponding hetero- 
zygosity values of these markers in the CEPH popula- 
tion, used by Genethon. 
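
Heterozygosity derived from allele frequencies is conventionally one minus the sum of squared frequencies. The sketch below assumes that this standard definition is what was used here; the function name is hypothetical, and the adjustment for known relationships among individuals, which enters when the allele frequencies themselves are estimated, is not shown.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for one marker."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Four equally frequent alleles, as assumed in the power simulations.
print(expected_heterozygosity([0.25, 0.25, 0.25, 0.25]))
# One common allele lowers heterozygosity and makes IBS sharing more likely.
print(round(expected_heterozygosity([0.7, 0.1, 0.1, 0.1]), 2))
```

A marker with four equally frequent alleles has a heterozygosity of .75, whereas a marker dominated by one common allele falls well below that, which is the situation that inflates identity-by-state sharing.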

Discussion 

Screening for Complex Disease Loci by LD 
Approaches 

Our intention in this work was to explore the feasi- 
bility of using LD methods to screen the genome for 
susceptibility genes for a common, genetically complex 
disorder. The results obtained in our LD-based search
for possible BP-I gene loci on chromosome 18 were
encouraging (specific susceptibility regions were
suggested), but they highlight a number of issues that must
be considered before LD screening is widely adopted. 

Successful application of a shared-segment approach
to any LD study depends on (1) a marker-map density
that is appropriate to the age of the population isolate
being studied and (2) a sharing threshold that will not
be too high to allow true IBD areas to be identified and
that will not be so low as to include many areas that
are IBS false-positive signals. An appropriate marker
map for an LD-screening study should have segments of
a size expected to be shared IBD by many of the affected
individuals. In addition to the density of the marker map
used, the number of generations separating affected
individuals from their common ancestor and the rate of
etiologic heterogeneity in the population will also
influence the choice of the sharing threshold used to trigger
further study. For example, if the common (disease-gene
bearing) ancestor is removed from the current
descendants by >10 generations, the length of true IBD
haplotypes shared by ≥50% of the descendants may be <5
cM (and certainly <6 cM, as is the screen used in this
study) (Te Meerman et al. 1994; Durham and Feingold
1997). Our choice of a threshold of 50% of affected
individuals sharing a possible haplotype therefore
effectively meant that we were likely to identify only BP-I
genes of a major effect in this population (phenocopy
rate approaching zero), and even then, only if the
distance from a common ancestor is not >~10 generations.
Although this was probably too stringent a screen
threshold, given the complex etiology of bipolar
disorder, the alternative we faced, reducing the threshold to
a lower percentage of potential IBD sharing, would
have drastically decreased the specificity (and hence the
utility) of the screen. For instance, in this particular
study, lowering the threshold to a possible IBD
haplotype shared in ≥25% of the patients would have resulted
in 24 of the 25 regions tested being determined as regions
of interest. If, in future studies, definite phase
information can be set for a greater proportion of the
probands (obtained from phasing information supplied by
additional relatives), the "possible IBD" threshold will
be more useful as a screening criterion at thresholds
approaching 25% sharing (almost one in five of the
patients with BP-I in the current study had no relative
available for phase construction). Finally, regardless of
the threshold chosen, there is no widely accepted
statistical test available to evaluate the significance of the
number of shared haplotypes observed, although several
statistical approaches are under development (reviewed
by Kruglyak 1997).

The use of markers with low heterozygosity will
increase the number of false-positive results in a shared-
segment screen, as some regions may pass the threshold
because of IBS sharing of a common allele. For example,
the four regions that passed our shared-segment screen,
but gave no evidence of LD in the likelihood-ratio tests
(D18S843-D18S464, D18S464-D18S378, D18S64-
D18S60, and D18S60-D18S68), included markers that
had relatively low heterozygosities in the study
population (D18S464, D18S60, and D18S378; table 1).

There are two ways to overcome the limitations of 
shared-segment analysis, as seen in this study. The first 
is to increase the density of markers in the initial screen 
(i.e., increase the proportion of BP-I individuals in whom 
a shared haplotype can be detected, thus decreasing the
number of false-negative results). Second, future screen- 
ing studies may focus on individuals who have available 
parents (i.e., increase the number of patients for whom 
we can set phase, thus allowing the threshold to be low- 
ered in a meaningful way and decreasing the number of 
false-positive results). 

For a formal statistical test of LD, we used the like- 
lihood-ratio test rather than another frequently used 
method, the transmission disequilibrium test (TDT) 
(Spielman et al. 1993), because data from all 48 patients 
with BP-I could be used in the likelihood approach. Ef- 
fective use of the original TDT requires parental geno- 
types, which were unavailable for 20 of the 48 patients 
with BP-I. One potential source of false-negative results, 
in our application of the likelihood-ratio test for LD, is 
that it is dependent on the specific genetic model for the 
disease used in the analysis. For instance, the results of 
the likelihood analysis presented here are applicable only 
to transmission of dominantly inherited BP-I genes in 
the CVCR population. The power of the likelihood test
is also critically dependent on the polymorphism content 
of the markers tested and the density of the markers 
used for a screening analysis. 

Evaluation of Potential BP-I Loci on Chromosome 18 

Our previous linkage study of BP-I in two Costa Rican 
pedigrees had provided several possible localizations for 
BP-I, throughout the genome. Since the 48 Costa Rican 
patients in the present study (collected independently of 
the pedigree studies and with no known relation to the 
pedigree members) are descended from the same ances- 
tral population as the patients in those pedigrees (CR001 
and CR004; Freimer et al. 1996b), we had reasoned that 
LD could be present in the population sample at markers
surrounding any true BP-I loci identified in the pedigree
study. LD screening of patients with BP-I in this
population might also yield important BP-I loci that were
not identified in the pedigree study. Pedigree-based
linkage studies involve selection of certain subsets
(individual families) of the population in which there is a
clustering of affected individuals. In a complex disease, such
studies may be useful in finding genes of large effect in
those particular subsets, but they might not identify loci 
that are important in understanding the basis of the dis- 
ease in the general population. In this LD screen of chro- 
mosome 18 in Costa Rican patients with BP-I, two 
regions were highlighted as being of particular interest, 
and both regions correspond to segments highlighted in 
the previous pedigree studies from Costa Rica. 

We previously highlighted the 18q22-23 chromoso- 
mal region (Freimer et al. 1996a) because this area 
showed the strongest evidence suggestive of linkage in 
the two pedigrees of any region tested in a genome screen 
conducted with ~500 microsatellite markers (McInnes
et al. 1996). In the pedigree study, portions of a hap- 
lotype of >40 cM in this region were shared by 22 of 
26 individuals with BP-I (Freimer et al. 1996a), although 
formal LOD scores for markers in this area were below 
the level of significance required for proof of linkage. In 
the current study, five markers in the 18q22-23 region 
provide possible evidence of LD in Costa Rican patients 
with BP-I. The marker that gave the strongest evidence 
of possible LD in the current study, D18S1121, is located
within the 3-cM region of highest haplotype sharing ob- 
served in the individuals with BP-I from the pedigree 
study. The specific allele (of 168 bp), which is over- 
represented on the disease chromosomes at this locus 
(D18S1121) in the sample of the population with BP-I, 
is also the allele that occurs on the putative high-risk 
haplotype within the pedigrees (Freimer et al. 1996a). 

Our pedigree studies had also highlighted a region at 
18pter deserving of further study (McInnes et al. 1996),
albeit in only one of the two families, CR001. The sec- 
ond-highest LOD score in the genome observed for fam- 
ily CR001 was at D18S59, located near 18pter, and a 
nearby marker, D18S476, also gave a positive LOD 
score in this family. This current study of 48 patients 
with BP-I now provides additional evidence for a BP-I
locus in this region, with the same two markers showing 
evidence of LD. Because genomewide significance levels 
have yet to be calculated for LD tests (Kruglyak 1997), 
we can at present only interpret the evidence for LD in 
the 18pter region (a pointwise P value of .002 for marker 
D18S59) as being roughly equivalent to Lander and 
Kruglyak's criteria for suggestive, but not significant, 
linkage in a genomewide screen (Lander and Kruglyak 
1995). The alleles at D18S59 and D18S476 that are 
overrepresented among the patients with BP-I, from the 
population sample (154 and 271 bp, respectively), are 
also overrepresented in the patients with BP-I from
pedigree CR001 (all patients with BP-I in family CR001
have at least one copy of the 154 allele at D18S59),
possibly indicating that the patients with BP-I in the
pedigree share this region IBD with those 48% of
patients with BP-I from the population sample who also
carry this haplotype.

The third region that showed possible evidence of LD
in our population sample was identified through a single
marker (D18S467), in the 18q12.3 region. Additional
markers typed near this one did not support the initial
suggestion of LD in this region. Evidence from a linkage
test that yielded a significance level of P = .06 would be
expected to occur, by chance, ~24 times (about once on
most chromosomes) in a genomewide screen.
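
The arithmetic behind a chance-occurrence estimate of this kind is simply the pointwise significance level multiplied by the number of independent tests. The number of tests in a genomewide screen is not stated here; the figure of roughly 400 below is only what the quoted count of about 24 would imply, and it is an inference rather than a value from the study.

```latex
\[
E[\text{chance findings}] = \alpha \times m ,
\qquad
m \approx \frac{24}{0.06} = 400 \ \text{independent tests (implied, not stated)} .
\]
```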

Our possible BP-I localizations at 18pter and 18q22-23
in the current sample are distinct from regions on
chromosome 18 suggested by other groups as being
possibly linked to mood disorder (Berrettini et al. 1994;
Stine et al. 1995). We detected no evidence of association with
these areas (near the centromere and in 18q21) in our 
BP-I population sample; nevertheless, the power of our 
current sample is not great enough to rule out these 
regions as potential BP-I loci in the Costa Rican pop- 
ulation. McMahon et al. (1997) have recently reported 
excess allele sharing in sib pairs at D18S541, which is in
the 18q22-23 region, although their affected status in- 
cluded not only BP-I, but also bipolar type II and schi- 
zoaffective patients. 

Future Directions 

The results of this study suggest that shared-seg- 
ment-screening approaches will only be useful with the 
development of denser marker maps (Collins et al. 19? t) 
and with the development of tests that permit statistical 
comparison of disease-chromosome haplotypes with 
control-chromosome haplotypes. Because the potential 
advantages of a shared-segment approach are substantial 
(this type of approach takes maximum advantage of the 
fact that haplotypes, not just single alleles, are inherited 
IBD in population isolates, and it is nonparametric), and 
because marker maps (Dib et al. 1996; Yuan et al. 1997) 
and statistical methods continue to improve, we remain 
optimistic about this method of mapping genes for
complex disorders.

Both the 18pter and the 18q22-23 regions would have
been identified as regions with possible LD at a
significance of P < .05, even if we had not used the
shared-segment approach but had instead screened for
evidence by using only the likelihood-ratio test, with the
original 26 markers. Our results indicate that, in
population isolates, such as the CVCR, and with suitably
dense marker coverage, tests similar to the
likelihood-ratio test of LD
(Terwilliger 1995) are promising tools for genome 
screening of complex diseases. It is not clear, however, 
whether currently available tests will be powerful 
enough to detect unequivocal proof of association in a
genomewide scan for such diseases, given sample sizes 
that are easily obtained. More-powerful tests are needed 
and may emerge from efforts to develop measures that 
make use of haplotype information (Service et al. 19?? 
[in this issue]; Durham and Feingold 1997; Goldin and
Chase 1997). 

The test of LD screening conducted in the current 
study points out the need to do a more complete LD 
screening analysis. We thus intend to perform an LD 
screen of chromosome 18, using an expanded sample of 
patients with BP-I and a denser marker map. The ad- 
dition of more-polymorphic markers to genome maps, 
and the application of haplotype-based statistical tests 
currently under development, should facilitate efforts to 
definitively identify BP-I susceptibility genes in Costa 
Rica. 

Linkage disequilibrium (LD) methods offer
great promise for mapping complex traits,
but have thus far been applied sparingly. In
this paper we describe an LD mapping study
of severe bipolar disorder (BP-I) in the
genetically isolated population of the
Central Valley of Costa Rica. This study
provides the first complete screen of a
chromosome for a complex trait using LD
mapping and presents the first application
of a new LD mapping statistic (ancestral
haplotype reconstruction (AHR)) that
evaluates haplotype sharing among affected
individuals. The results of this
chromosome-wide analysis are instructive for
genome-wide LD mapping in isolated
populations. Furthermore, the analysis
continues to support a possible BP-I locus on
18pter, suggested by previous analyses in
this population. Evidence for a possible BP-I
locus on 18q12.2 is also described.
© 2001 Wiley-Liss, Inc.

KEY WORDS: complex phenotype; population isolate; association mapping

INTRODUCTION

There is considerable current interest in the possibility
of locating susceptibility genes for complex traits by
evaluating linkage disequilibrium (LD) between such
traits and densely mapped sets of DNA markers [Risch
and Merikangas, 1996]. Several lines of evidence
suggest that LD mapping of complex traits may be
most successful when performed in samples from
genetically isolated populations that have expanded
dramatically in size in the relatively recent past
[Wright et al., 1999], although some authors have
contested this suggestion on theoretical grounds
[Terwilliger and Weiss, 1998]. We have focused on using LD
approaches to map susceptibility genes for severe
bipolar mood disorder (BP-I) in samples of hospitalized
patients drawn from the genetically isolated population
of the Central Valley of Costa Rica (CVCR). BP-I is an
excellent trait in which to test such approaches, as
there is strong evidence for its genetic basis [Reus and
Freimer, 1997], yet mapping studies employing standard
linkage approaches have provided equivocal
results in localizing susceptibility genes for this
disorder and no BP-I predisposition genes have yet
been identified [Potash and DePaulo, 2000].

In a previous study [Escamilla et al., 1999] we 
assessed the feasibility of two different LD methods 
for mapping BP-I, a search for shared segments 
[Houwen et al., 1994] and a single marker likelihood 
test [Terwilliger, 1995]. In that study we genotyped a 
sample of 48 individuals and available relatives drawn 
from the CVCR using 25 markers spaced at approximately
6-cM intervals along chromosome 18. From
simulations reported in that paper, we determined that
for reasonable power to detect LD with a single-point 
likelihood test, a larger sample size and more dense 
spacing of markers would be necessary than was used 
in our initial study. We also recognized that detecting 
LD through the analysis of haplotype sharing in such a 
sample would require a substantial number of relatives 
to set the phase for patient chromosomes, as well as a 
more effective test than a simple comparison of shared 
segments. In the current study we screened for LD on 
chromosome 18 at a higher marker density, using a 
larger sample of BP-I patients (and with more relatives 
available for genotyping), and employed a newly 
developed statistical test (termed ancestral haplotype 
reconstruction (AHR)), designed to find LD via haplo- 
type analysis in samples drawn from population 
isolates [Service et al., 1999]. This study provides the 
first implementation of AHR, which serves as a 
complementary test to the likelihood-based LD analysis 
method of Terwilliger [1995] (abbreviated here as LD- 
T). The current screen incorporated a two-phased 
approach; regions of the chromosome that showed 
possible LD in an initial phase of genotyping were 
investigated at higher marker density in a larger 
sample of patients and relatives. The results of this 
two-phased chromosome 18 LD screen are reported 
here. 

METHODS 

Sample Collection 

All subjects were recruited in accordance with the 
principles of the Declaration of Helsinki and with 
approval from the Institutional Review Boards of the 
University of California at San Francisco and the 
University of Costa Rica (UCR). All probands had a 
definite diagnosis of BP-I (based on best-estimate 
diagnostic procedures, as described previously [Frei- 
mer et al., 1996a]) with onset by age 50 and a history of 



at least two psychiatric hospitalizations. Probands 
were recruited independently from one another from 
psychiatric hospitals and clinics in the CVCR. First- 
degree relatives of probands were also recruited where 
possible to permit determination of genetic phase (for 
further information on ascertainment and diagnostic 
procedures, see Escamilla et al. [1996, 1999]). The 
sample for the phase I screening included 19 probands 
with two parents available for genotyping, 33 with one 
parent available, 9 with children available, 3 with 
children and spouse available, 1 with siblings available, 
and 4 with no relatives available. The sample for the 
phase II screening included 62 probands with two 
parents available, 102 with one parent available, 15 
with children available, 7 with children and spouse 
available, 19 with siblings available, and 22 with no 
relatives available. 

For the LD-T analyses, different weights were given 
to the genotype results of each proband, based on the 
degree to which his/her ancestry (at the great-grand- 
parental level) was known to derive from the CVCR. 
The distributions of CVCR ancestries for probands in the 
phase I (N = 69) and phase II (N = 227) samples are 
shown in Table I. The phase I and phase II samples 
were equivalent in the proportions of ancestry derived 
from the CVCR (Table I). 

A control sample was recruited from undergraduate 
students at the UCR. As there are very low tuition costs 
at the UCR, the largest public university in the country, 
the student body covers virtually the entire social 
spectrum of Costa Rica. We sampled 26 students and 
both their parents (using the latter 52 individuals as 
the controls, with the phase set from the students' 
genotypes). Controls had all eight great-grandparents 
born in the CVCR. Although, given the age of under- 
graduate students, not all individuals may be past the 
typical age of onset for BP, we believe they represent a 
reasonable population sample. They were not sampled 
with the anticipation of being free of the phenotype 
studied, but with the expectation that they would be 
representative of allele frequencies in the general 
population of the CVCR. As the statistical analyses 
employed use estimates of population allele frequen- 
cies, addition of these control chromosomes provides a 
more accurate estimate of these frequencies in the 
population. 

Initial Genotyping 

For this study we chose 41 microsatellite markers of 
the highest heterozygosity available to cover chromosome 
18 at approximately 3-cM intervals [Broman et al., 



TABLE I. Distribution of CVCR Ancestry for Probands in the Phase I and Phase II Samples 

# of GGP from CVCR                2       3       4         5       6         7        8 
Percent (N) in Phase I sample     0 (0)   1 (1)   12 (8)    4 (3)   16 (11)   7 (5)    59 (41) 
Percent (N) in Phase II sample    1 (3)   2 (5)   13 (30)   2 (5)   13 (30)   8 (19)   59 (135) 

GGP, great-grandparents. 



1998]. The markers were obtained from the Genethon 
and Cooperative Human Linkage Center (CHLC) sets 
(see www.genethon.fr and lpg.nci.nih.gov/CHLC, respectively). 
We genotyped these markers in a sample of 
69 BP-I individuals and their available relatives 
(N = 162) and in the control sample. The genotype data 
were analyzed using the LD-T and the AHR. The 
results of these analyses are shown in Table II. 
Genotyping was either semiautomated, using an ABI 
377 apparatus for markers for which an assay for this 
apparatus was already available (22 markers), or performed by 
radioactive labeling using a previously described protocol 
[Bull et al., 1999] (19 markers). 

Follow-up Genotyping 

Five chromosome segments that included markers 
that passed a predetermined threshold for further 
investigation were followed up in a larger sample 
(phase II) from the CVCR (227 BP-I individuals and 
their available relatives and CVCR controls). Given the 
uncertainty regarding the power of LD analysis for 
such screening, we set a low threshold for considering 
regions to be potentially interesting (for LD-T, λ > 0.25 
or P < 0.05; for AHR, P < 0.05). Regions were followed 
up using additional markers flanking the original 
marker at a distance of about 1 cM, where such 
markers were available, otherwise at the nearest 
available markers. For locations where more than one 
marker was available, we chose the marker with the 
highest heterozygosity. These markers were all from 
the Genethon and CHLC collections, with the exception 
of sava5 [Vocero-Akbani et al., 1996]. For the phase II 
markers, genotyping was either automated (2 markers) 
or manual (16 markers), according to the protocols 
described above. The markers in each segment were 
analyzed using both LD-T and AHR. 
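
The follow-up rule described above can be sketched in a few lines of Python. This is only an illustration of the stated LD-T threshold (λ > 0.25 or P < 0.05); the function name and data layout are hypothetical. The values are the trios-only LD-T entries from Table II or, where a marker has no trios-only entry, the trios-plus-controls entries. 

```python
# Hypothetical helper: applies the screening rule stated in the text
# (for LD-T, lambda > 0.25 or P < 0.05).
LAMBDA_THRESHOLD = 0.25
P_THRESHOLD = 0.05


def passes_ldt_threshold(lam, p_value):
    """Return True if a marker qualifies for phase II follow-up."""
    return lam > LAMBDA_THRESHOLD or p_value < P_THRESHOLD


# (lambda, P-value) pairs taken from Table II.
phase1_results = {
    "D18S59": (0.48, 0.0012),
    "D18S1105": (0.26, 0.055),   # trios plus controls
    "D18S1163": (0.57, 0.055),
    "D18S467": (0.53, 0.014),
    "D18S469": (0.34, 0.046),
    "D18S1141": (0.31, 0.165),   # trios plus controls
}

# Markers that would be carried forward under this rule.
follow_up = [
    marker
    for marker, (lam, p_value) in phase1_results.items()
    if passes_ldt_threshold(lam, p_value)
]
```

Several markers qualify on λ alone although their P-values exceed 0.05, which is consistent with the deliberately low threshold described in the text. 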

Analyses 

All analyses were performed both with and without 
utilizing the Costa Rican control sample described 
above. Addition of the control sample provided more 
accurate estimation of population allele frequencies. 
We applied a modified version of the LD-T for associa- 
tion, first proposed by Terwilliger [1995], to each 
individual marker independently (see Escamilla et al. 
[1999] and Freimer et al. [1996b] for further details on 
the modifications to the LD-T). The LD-T assesses the 
likelihood that a particular allele is overrepresented on 
disease chromosomes, compared to nondisease chromo- 
somes, and provides a quantitative estimate of this 
overrepresentation in the form of a single parameter, λ, 
for each marker. We also applied the AHR test on 
windows of three markers. AHR compares the distribution 
of haplotypes in affected individuals with the 
distribution expected for individuals bearing a disease 
mutation descended from a common ancestor. Three 
parameters are estimated under the alternative: the 
time since a common founder (g), the percentage of 
chromosomes in affected individuals to have descended 
from this founder (alpha), and the position of the 
disease locus (x). The likelihood was evaluated at five 
steps (estimates of x) between each marker, at 15 
estimates of g, ranging from 10 to 1,000, and at 50 
estimates of alpha, ranging from 0.02 to 1.0. While the 
population history of the CVCR suggests founding in 
the 16th and 17th centuries, it is possible that disease 
mutations shared by patients predate the founding of 
the population. Furthermore, similar distributions for 
the expected haplotype counts of affected individuals 
can be obtained at different combinations of g and 
alpha, indicating differing explanations for the same 
data. AHR was modified from the form presented in 
Service et al. [1999] to include linkage disequilibrium 
between markers under the null, as suggested by 
McPeek and Strahs [1999]. When a maximum like- 
lihood was found with this range of g, we examined 
smaller increments of possible g to further refine the 
maximum. 
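
As a rough illustration of how a single parameter λ can quantify overrepresentation of an allele on disease chromosomes, the Python sketch below uses the parameterization commonly attributed to Terwilliger [1995]: the associated allele has frequency p + λ(1 - p) on disease chromosomes, and every other allele has frequency p(1 - λ). It is a simplification made under stated assumptions. The associated allele is taken as known, all population frequencies are assumed to be positive, the allele counts are hypothetical, and the proband weighting and other modifications used in this study are not modeled. The function names are illustrative only. 

```python
import math


def disease_allele_freqs(pop_freqs, associated, lam):
    """Allele frequencies on disease chromosomes for a given lambda.

    Assumed model: the associated allele is enriched to p + lam * (1 - p);
    all other alleles are scaled down to p * (1 - lam).
    """
    return [
        p + lam * (1.0 - p) if i == associated else p * (1.0 - lam)
        for i, p in enumerate(pop_freqs)
    ]


def log_likelihood(disease_counts, pop_freqs, associated, lam):
    """Multinomial log likelihood of the disease-chromosome allele counts."""
    freqs = disease_allele_freqs(pop_freqs, associated, lam)
    return sum(c * math.log(q) for c, q in zip(disease_counts, freqs) if c > 0)


def fit_lambda(disease_counts, pop_freqs, associated, steps=1000):
    """Grid-search estimate of lambda on 0 <= lambda < 1."""
    grid = [k / steps for k in range(steps)]
    return max(
        grid,
        key=lambda lam: log_likelihood(disease_counts, pop_freqs, associated, lam),
    )
```

With hypothetical population frequencies of 0.2, 0.3, and 0.5 and disease-chromosome counts of 60, 15, and 25, `fit_lambda([60, 15, 25], [0.2, 0.3, 0.5], 0)` estimates λ for the first allele; counts that mirror the population frequencies give an estimate of zero. 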

Heterogeneity 

Heterogeneity in marker allele frequencies between 
the initial set of probands used in the phase I study and 
the additional individuals recruited for the phase II 
study was assessed as follows: Allele frequencies were 
estimated separately in phase I (N = 69 probands) and 
phase II (N = 158 probands), and then in the two sets 
combined (N = 227 probands). Under the null hypothesis 
of no heterogeneity in allele frequencies between 
phase I and phase II samples, the difference between the -2 × 
log likelihood of the combined set and the sum of the 
-2 × log likelihoods of the separate sets should be 
distributed as a chi-square random variable with m - 1 
degrees of freedom, where m is the number of alleles at 
the marker being tested. 
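
The heterogeneity statistic described above can be sketched as follows. This is a minimal illustration that assumes a plain multinomial likelihood for the allele counts in each set, which may differ from the likelihood actually used in the analysis. The function names are hypothetical, and any counts supplied are placeholders rather than study data. 

```python
import math


def multinomial_log_likelihood(counts):
    """Log likelihood of allele counts at their maximum likelihood frequencies."""
    total = sum(counts)
    return sum(c * math.log(c / total) for c in counts if c > 0)


def heterogeneity_statistic(counts_phase1, counts_phase2):
    """Return (statistic, degrees of freedom) for one marker.

    The statistic is the -2 x log likelihood of the combined set minus the
    sum of the -2 x log likelihoods of the two separate sets.
    """
    combined = [a + b for a, b in zip(counts_phase1, counts_phase2)]
    minus2ll_combined = -2.0 * multinomial_log_likelihood(combined)
    minus2ll_separate = -2.0 * (
        multinomial_log_likelihood(counts_phase1)
        + multinomial_log_likelihood(counts_phase2)
    )
    statistic = minus2ll_combined - minus2ll_separate
    degrees_of_freedom = len(combined) - 1  # m - 1, with m alleles
    return statistic, degrees_of_freedom
```

When the two sets have identical allele proportions the statistic is zero; it grows as the proportions diverge, and it is compared against a chi-square distribution with m - 1 degrees of freedom. 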

RESULTS 

Phase I Screening 

In this phase we genotyped 69 affected individuals 
from the CVCR and 52 control individuals from the 
same population for 41 microsatellite markers. We 
assessed LD via the LD-T and by AHR; AHR used only 
the 55 patients with one or both parents available. The 
markers that exceeded these thresholds for the LD-T 
analysis are indicated in Table II; no marker sets 
exceeded the threshold for AHR. 

Phase II Screening 

In this phase we followed up the potentially inter- 
esting regions from the phase I screen by genotyping an 
expanded sample of BP-I patients and relatives in any 
marker that showed a signal in the phase I screen, as 
well as in two additional flanking markers (at 1 cM 
distance, if a marker was available; otherwise we typed 
the nearest available highly polymorphic microsatellite 
marker). Not all markers were found on one genetic 
map. Our primary resources were the Marshfield 
integrated map [Broman et al., 1998] and the Genethon 
map [Dib et al., 1996]. If a marker could not be found on 
either of these maps, we attempted to combine 
TABLE II. Markers to Exceed the Threshold of λ > 0.25 or P < 0.05 in the Phase I Portion of the Study 

            Trios only                                    Trios plus controls 
Marker      λ      χ²     P-value   Associated allele     λ      χ²     P-value   Associated allele 
D18S59      0.48   9.18   0.0012    154                   0.45   9.28   0.0011    154 
D18S1105                                                  0.26   2.55   0.055     101 
D18S1163    0.57   2.54   0.055     335                   0.43   1.12   0.145     335 
D18S467     0.53   4.73   0.014     172                   0.4S   4.27   0.019     172 
D18S469     0.34   2.84   0.046     234                   0.26   0.74   0.195     234 
D18S1141                                                  0.31   0.95   0.165     275 

'Trios only' indicates results from the 69 probands and their available relatives (N = 162). 'Trios plus controls' indicates results including the addition of 
genotypes from the control individuals (N = 52). 



information from other genetic and physical maps in 
public databases to make a composite map. Our follow- 
up areas were in five different regions (Fig. 1). Interval 
1 markers were placed on a composite map 
(sava5-0.3cM-D18S59-0.3cM-D18S1231-1.4cM-D18S1105-1.5cM-CHLC.GATA166D05). 
The markers in interval 2 
were all found only on the Marshfield map 
(D18S967-4.4cM-D18S1163-4.0cM-D18S843). The markers in 
interval 3 were found on both the Genethon and 
Marshfield maps. The total area covered by the three 
markers in interval 3 was approximately the same in 
the two maps; however, the intermarker distances were 
different (Genethon: D18S1157-1.5cM-D18S467-2.3cM-D18S450; 
Marshfield: D18S1157-4.4cM-D18S467-0cM-D18S450). 
As the differences in intermarker 
distances between D18S1157 and D18S450 
were fairly substantial, we did separate analyses with 
both maps. Interval 4 was made using a composite map 
(D18S870-2.0cM-D18S469-0.1cM-D18S879). All three 
markers in interval 5 were on both Genethon and 
Marshfield maps, with nearly the same genetic distances 
between markers 
(D18S1122-2.5cM-D18S1141-1.9cM-D18S70). 

The dense marker set was genotyped in a sample of 
227 BP-I patients and their relatives (and in a control 
sample consisting of 52 unrelated individuals from the 
CVCR). The data were analyzed using both the LD-T 
and AHR. For AHR, only 164 individuals were used, 
i.e., those with at least one parent genotyped. For the 
LD-T, all of the markers in intervals 1, 2, and 4 (sava5, 
D18S59, D18S1231, D18S1105, 166D05, D18S967, 
D18S1163, D18S843, D18S870, D18S469, and 
D18S879) and two of the markers in intervals 3 and 5 
(D18S1157 and D18S1122) resulted in an estimate of λ 
of zero. The LD-T results for the remaining markers in 
intervals 3 and 5 (D18S467, D18S450, D18S1141, and 
D18S70) are in Table III. For AHR (Table IV), the 
strongest evidence of LD in the phase II screen was in 
regions including the markers that had shown the most 
evidence for LD in the phase I screen (D18S59 near 
18pter and D18S467 in 18q12). The AHR test suggests 
possible BP-I localizations between D18S59 and 
D18S1231 (peak lod score of 1.52, equivalent to 
P = 0.008) and between D18S467 and D18S450 (peak 
lod score of 1.89, equivalent to P = 0.003 using Marshfield 
map distances; peak lod score of 2.03, P = 0.002 
using Genethon map distances). Additionally, the 



alleles associated with BP-I at the markers displaying 
LD in both the phase I and phase II studies were the 
same (154 bp at D18S59 and 172 bp at D18S467 and 275 
bp at D18S1141), as were the alleles associated with 
BP-I at the markers displaying LD using both the LD-T 
and AHR methods in the phase II screen. 
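
The text does not state how the lod scores were converted to P values. One assumption that reproduces the three reported equivalences is to treat 2 ln(10) × lod as a chi-square statistic with 1 degree of freedom and take its upper-tail probability. The Python sketch below shows that conversion; the function name is hypothetical, and the approach is an inference from the reported numbers rather than a documented step of the analysis. 

```python
import math


def lod_to_p(lod):
    """Convert a lod score to a P value via a 1-df chi-square approximation.

    Assumption: 2 * ln(10) * lod is treated as chi-square with 1 df.
    """
    chi_square = 2.0 * math.log(10.0) * lod
    # Upper-tail probability of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chi_square / 2.0))


# Peak lod scores quoted in the text.
p_d18s59 = lod_to_p(1.52)
p_d18s467_marshfield = lod_to_p(1.89)
p_d18s467_genethon = lod_to_p(2.03)
```

Rounded to three decimal places, these give 0.008, 0.003, and 0.002, matching the values quoted in the text. 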



Fig. 1. Ideogram of chromosome 18, showing the five follow-up 
intervals from phase II of the study. 
TABLE III. Markers in the Phase II Portion of the Study to Result in Non-Zero Estimates of λ 

            Trios only                                    Trios plus controls 
Marker      λ      χ²     P-value   Associated allele     λ      χ²     P-value   Associated allele 
Interval 3 
D18S467     0.64   5.93   0.007     172                   0.61   8.79   0.0015    172 
D18S450                                                   0.37   3.40   0.033     204 
Interval 5 
D18S1141    0.51   2.07   0.075     275                   0.43   1.99   0.079     275 
D18S70      0.17   1.77   0.09      114 

'Trios only' indicates results from the 227 probands and their available relatives (N = 563). 'Trios plus controls' indicates results including the addition of 
genotypes from the control individuals (N = 52). 



Heterogeneity Analysis 

We performed this analysis to evaluate whether 
heterogeneity in allele frequencies between sets of 
subjects sampled over the time course of this study 
could explain varying results in the different LD-T 
analyses, in particular at marker D18S59 (between the 
set of 69 individuals in phase I of the current study and 
the 227 individuals in phase II of the current study). 
The heterogeneity method applied here tested for allele 
frequency differences in the first set of 69 individuals 
and the second set of 158 individuals. This test failed to 
reject the null hypothesis of homogeneity considering 
all markers in the phase II study in a combined test. 

DISCUSSION 

In this study we have extended our previous efforts to 
develop LD-based approaches for genome screening to 
map susceptibility genes for complex traits [Escamilla 
et al., 1999]; as in our previous study, we used the 
entire chromosome 18 as a test case. In particular, we 
applied two different forms of LD analyses to a data set 
of BP-I patients and relatives, using for the first time a 
recently developed approach (AHR) based on recon- 



structing ancestral haplotypes in isolated populations 
[Service et al., 1999]. Two of the regions suggestive of 
LD with BP-I in our previous study (at 18pter and 
18q12.2-12.3), using a smaller sample [Escamilla et al., 
1999], continue to suggest LD in the present study, 
using more densely spaced markers and a larger 
sample size, while a third previously interesting region 
(18q22-23) shows negligible evidence of LD in the 
current total sample. 

Evidence for LD in the 18pter region (surrounding 
D18S59) was observed at all stages of our prior and 
current studies. However, the level of support for LD in 
this region varied between the initial and subsequent 
samples and with the type of test used. In our prior 
study and in phase I of the current study, LD was 
detected using LD-T but not with haplotype approaches 
(shared segment evaluation and AHR, respectively), 
while in phase II of the current study, LD was detected 
with AHR but not with LD-T. This difference does not 
reflect heterogeneity between the sample sets, at least 
as indicated by a formal test for heterogeneity with the 
markers we have examined. Additionally, regardless of 
the particular test showing evidence for LD, the same 
allele at D18S59 has been associated with the disease 



TABLE IV. AHR Results From Three-Marker Intervals From the Phase II Portion of the Study 

3-marker interval and genetic distances    Peak lod score   Location of peak score (recombination fraction)   Estimate of g   Estimate of alpha   Associated haplotype 
sava5-0.3cM-D18S59-0.3cM-D18S1231          0.89             0.0018 from 18S59                                 298             12%                 235-154-10 
  With controls                            0.95             0.0012 from 18S59                                 296             12%                 235-154-10 
D18S59-0.3cM-D18S1231-1.4cM-D18S1105       1.33             At 18S59                                          71              8%                  154-10-85 
  With controls                            1.52             0.0019 from 18S59                                 96              10%                 154-10-85 
D18S1231-1.4cM-D18S1105-1.5cM-166D05       0.33             0.0118 from 18S1105                               6               2%                  20-85-308 
  With controls                            0.37             0.0118 from 18S1105                               5               2%                  20-85-308 
D18S967-4.4cM-D18S1163-4.0cM-D18S843       0.41             At 18S967                                         13              10%                 220-202-178 
  With controls                            0.63             0.0087 from 18S967                                14              14%                 220-202-178 
Genethon 
D18S1157-1.8cM-D18S467-2.3cM-D18S450       0.53             0.009 from 18S467                                 82              28%                 128-172-204 
  With controls                            2.04             0.009 from 18S467                                 92              42%                 128-172-204 
Marshfield 
D18S1157-4.4cM-D18S467-0.2cM-D18S450       0.19             0.0016 from 18S467                                19              6%                  128-172-204 
  With controls                            1.85             0.00079 from 18S467                               1,000           40%                 128-172-204 
D18S870-2.0cM-D18S469-0.1cM-D18S879        0.04             0.0079 from 18S870                                1,000           12%                 179-236-236 
  With controls                            0.18             0.0079 from 18S870                                1,000           26%                 179-236-234 
D18S1122-1.5cM-D18S1141-1.9cM-D18S70       0.24             0.015 from 18S1141                                5               2%                  136-275-124 
  With controls                            0.19             0.015 from 18S1141                                5               2%                  136-275-124 

'With controls' indicates the AHR results from addition of the 52 control individuals to the sample of 164 probands with at least one parent genotyped. 
phenotype (see Tables II and IV). This associated allele 
at D18S59 was also observed previously in all BP-I 
patients from an extended Costa Rican pedigree 
[Escamilla et al., 1999]. However, the variability in 
evidence for a disease locus near D18S59 may reflect 
etiologic heterogeneity in the study sample. Indeed, the 
AHR tests estimated alpha for D18S59 (the proportion 
of disease chromosomes descended from a common 
ancestor) to be as low as 10%. 

The evidence for possible LD in the 18q22-23 region 
has diminished considerably in the current study, 
compared to earlier work. A linkage study of Costa 
Rican pedigrees had previously suggested a possible 
BP-I predisposition locus [Freimer et al., 1996b; 
McInnes et al., 1996] in the 18q22-23 region; LD 
screening using 48 BP-I patients from the CVCR had 
also implicated this region [Escamilla et al., 1999]. 
While two markers in 18q23 satisfied the criteria in 
phase I of the current study for further evaluation in 
phase II, LD-T and AHR P values for all markers tested 
in 18q23 were clearly nonsignificant in both phases. 
These results do not negate the linkage and haplotype 
evidence from the extended Costa Rican pedigrees 
suggesting a BP-I locus in 18q23, although they may 
suggest that such a locus does not play a major role in 
risk for BP-I in this population; further evaluation of 
BP-I in this region should therefore focus on the 
extended pedigrees. In both 18p and 18q23, evidence 
for association has not increased with an increase in 
sample size. For complex traits with high heterogene- 
ity, it is difficult to predict to what degree a follow-up 
sample will replicate an original true linkage finding 
[Suarez et al., 1994]. It is likely that under such a high 
degree of etiologic heterogeneity, association results 
such as ours are particularly sensitive to small changes 
in the composition of a study sample. 

In contrast to results on 18q23, evidence for a 
possible BP-I locus in the 18q12.2-3 region has 
increased at successive stages of our pedigree and LD 
studies, and evidence for a possible BP locus in this 
region has been previously reported by McMahon et al. 
[1997] in a North American sample (although their 
definition of affected status was wider than that 
employed in our study). In our genome screen of the 
pedigree sample from Costa Rica [McInnes et al., 1996], 
one of the markers in this interval (D18S450) resulted 
in a lod score of 1.08 (at a recombination fraction of 0.1) 
in one of the kindreds studied. In our first LD study of 48 
BP-I patients from the CVCR [Escamilla et al., 1999], 
D18S467, from this interval, was associated with BP-I 
using the LD-T. In phase I of the current study, in 
which the sample was expanded from Escamilla et al. 
[1999] by 21 BP-I patients plus additional relatives, the 
evidence for association at D18S467 increased, with 
P = 0.014. In phase II of the current study, with the 
sample of BP-I patients expanded to a total of 227, the 
evidence for LD at this locus increased further 
(P = 0.007 using the LD-T without controls and 
P = 0.0015 with controls). Moreover, one of the markers 
near D18S467 chosen for phase II of this study also 
showed evidence of LD (with the LD-T test) at a P value 
of 0.033 in the total sample. Finally, this region also 



shows suggestive evidence of LD using the AHR test in 
the total sample, at a significance level equivalent to a P 
value of 0.003 (Marshfield map) or 0.002 (Genethon 
map). 

The overall evidence from the AHR analyses in this 
region is similar regardless of which map is used; 
however, the interpretation of the parameters is quite 
different. When virtually no recombinations are allow- 
ed between D18S467 and D18S450, the observed haplo- 
typic diversity in the sample of patients can be 
explained either by a very ancient founder (g = 1,000) 
or by a more recent founder (g = 19), but with few 
disease chromosomes descended from this founder 
(alpha = 6%). When more recombination is allowed 
between D18S467 and D18S450, the estimate of time 
since a common founder is between the estimates above 
(average g = 87). The interpretations of the parameters 
are not entirely independent of each other or of the map 
used in the analyses. Furthermore, the likelihood 
surface around the maximum likelihood estimates is 
relatively broad, with other combinations of g and 
alpha having similar lod scores. The differences in the 
Marshfield and Genethon maps over this small region 
are not surprising; it is very difficult to accurately 
assess genetic distance over regions of less than a few 
centimorgans, given the practical limitations on sample 
sizes used for constructing genetic maps. Further study 
is warranted to systematically assess the influence of 
such map inaccuracies on LD mapping approaches. 
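
The trade-off between g and alpha noted above can be made concrete with a back-of-the-envelope calculation. The Python sketch below is a deliberate simplification and not the AHR likelihood. It assumes that a founder-derived chromosome keeps the ancestral allele at a marker with probability (1 - theta)^g, ignores chance sharing of alleles, and uses a hypothetical recombination fraction of 0.002 (roughly 0.2 cM) between the disease locus and the marker. The function name is illustrative. 

```python
def expected_ancestral_fraction(alpha, theta, g):
    """Expected share of patient chromosomes carrying the founder allele.

    alpha: proportion of disease chromosomes descended from the founder
    theta: recombination fraction between disease locus and marker
    g:     generations since the common founder
    """
    return alpha * (1.0 - theta) ** g


THETA = 0.002  # assumed value, roughly 0.2 cM

# Ancient founder, large share of chromosomes descended from it.
ancient = expected_ancestral_fraction(alpha=0.40, theta=THETA, g=1000)

# Recent founder, small share of chromosomes descended from it.
recent = expected_ancestral_fraction(alpha=0.06, theta=THETA, g=19)
```

Under these assumptions both scenarios predict that roughly 5 to 6 percent of patient chromosomes carry the ancestral allele, which illustrates why quite different combinations of g and alpha can fit the same haplotype data about equally well. 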

This study provides the first use of AHR and a direct 
comparison of AHR and LD-T. The results of the two 
tests are not completely comparable, as the sample for 
AHR testing was smaller than the sample for LD-T 
testing (as LD-T does not require that all patient 
chromosomes be phase known). In our previous 
comparison of these methods by simulation studies 
[Service et al., 1999], we showed that AHR was more 
powerful than LD-T in conditions of high etiologic 
heterogeneity, but that the difference in power between 
the methods is much less for low heterogeneity or a very 
old disease mutation (i.e., where it is unlikely that a 
conserved haplotype could be observed without very 
densely spaced markers). Our prior simulation compar- 
isons of AHR and LD-T predicted that where alpha is 
high, both tests will be equally powerful in detecting 
association, whereas when alpha is low, AHR is 
potentially much more powerful than LD-T. These 
predictions may be reflected in our current data. For 
example, at D18S467, alpha is high for most analyses, 
and both methods detect an association of similar 
magnitude. In contrast, at D18S59, alpha is low, and 
AHR suggests association but LD-T does not. The 
failure to detect association with AHR in the phase I 
genotyping study likely reflects low power from the 
very small sample suitable for AHR testing, as well as 
the wide spacing of the markers. These comparisons 
require an important caveat, namely that none of the 
association results reported here meet unequivocal 
thresholds for statistical significance; therefore, it is 
not possible to state, based on these data, how the tests 
perform in locating a definitive gene predisposition 
locus for BP-I. Further evaluation of these chromosomal 
regions with larger samples and additional markers 
will be required to definitively prove whether BP-I 
predisposition genes are located at these sites on 
chromosome 18 and to gain a clearer assessment 
of the power of the AHR and LD-T approaches for gene 
mapping of complex traits in population isolates. 